Creators/Authors contains: "Knight, Rob"


  1. Greene, Casey S. (Ed.)
    ABSTRACT UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffering from memory contention. Here, we adapt UniFrac to graphics processing units using OpenACC, enabling greater than 1,000× computational improvement, and apply it to 307,237 samples, the largest 16S rRNA V4 uniformly preprocessed microbiome data set analyzed to date.

    IMPORTANCE UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another. Here, we adapt UniFrac to operate on graphics processing units, enabling a 1,000× computational improvement. To highlight this advance, we perform what may be the largest microbiome analysis to date, applying UniFrac to 307,237 16S rRNA V4 microbiome samples preprocessed with Deblur. These scaling improvements turn UniFrac into a real-time tool for common data sets and unlock new research questions as more microbiome data are collected.
  2. Abstract

    Studies using 16S rRNA and shotgun metagenomics typically yield different results, usually attributed to PCR amplification biases. We introduce Greengenes2, a reference tree that unifies genomic and 16S rRNA databases in a consistent, integrated resource. By inserting sequences into a whole-genome phylogeny, we show that 16S rRNA and shotgun metagenomic data generated from the same samples agree in principal coordinates space, taxonomy and phenotype effect size when analyzed with the same tree.

     
  3. Abstract Fish are the most diverse and widely distributed vertebrates, yet little is known about the microbial ecology of fishes or the biological and environmental factors that influence fish microbiota. To identify factors that explain microbial diversity patterns in a geographical subset of marine fish, we analyzed the microbiota (gill tissue, skin mucus, midgut digesta and hindgut digesta) from 101 species of Southern California marine fishes, spanning 22 orders, 55 families and 83 genera, representing ~25% of local marine fish diversity. We compare alpha, beta and gamma diversity while establishing a method to estimate microbial biomass associated with these host surfaces. We show that body site is the strongest driver of microbial diversity, while microbial biomass and diversity are lowest in the gill of larger, pelagic fishes. Patterns of phylosymbiosis are observed across the gill, skin and hindgut. In a quantitative synthesis of vertebrate hindguts (569 species), we also show that mammals have the highest gamma diversity when controlling for host species number, while fishes have the highest percentage of unique microbial taxa. The composite dataset will be useful to vertebrate microbiota researchers and fish biologists interested in microbial ecology, with applications in aquaculture and fisheries management.
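    As a reminder of how the three diversity levels relate, here is a minimal richness-based sketch. The abstract does not say which diversity measures the study used, so the definitions below (alpha as per-sample taxon count, gamma as pooled taxon count, beta as Whittaker's ratio of gamma to mean alpha) are assumptions, as are the function name and the toy samples.

```python
# Minimal sketch of a richness-based alpha/beta/gamma partition (illustrative).
# Assumed definitions: alpha = taxa per sample, gamma = taxa in the pooled
# samples, beta = gamma / mean alpha (Whittaker's multiplicative beta).


def diversity_partition(samples):
    """Return (mean_alpha, beta, gamma) for a list of sets of taxon names."""
    if not samples:
        raise ValueError("at least one sample is required")
    alphas = [len(s) for s in samples]
    gamma = len(set().union(*samples))
    mean_alpha = sum(alphas) / len(alphas)
    if mean_alpha == 0:
        raise ValueError("all samples are empty; beta is undefined")
    return mean_alpha, gamma / mean_alpha, gamma


if __name__ == "__main__":
    toy_samples = [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "d", "e"}]
    mean_alpha, beta, gamma = diversity_partition(toy_samples)
    print(mean_alpha, beta, gamma)
```

    With these toy samples each sample holds 3 taxa and the pool holds 5, so beta is 5/3: the pooled community is about 1.67 times as rich as an average sample.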
  4. Abstract

    Throughout the COVID-19 pandemic, massive sequencing and data sharing efforts enabled the real-time surveillance of novel SARS-CoV-2 strains throughout the world, the results of which provided public health officials with actionable information to prevent the spread of the virus. However, with great sequencing comes great computation, and while cloud computing platforms bring high-performance computing directly into the hands of all who seek it, optimal design and configuration of a cloud compute cluster requires significant system administration expertise. We developed ViReflow, a user-friendly viral consensus sequence reconstruction pipeline enabling rapid analysis of viral sequence datasets leveraging Amazon Web Services (AWS) cloud compute resources and the Reflow system. ViReflow was developed specifically in response to the COVID-19 pandemic, but it is general to any viral pathogen. Importantly, when utilized with sufficient compute resources, ViReflow can trim, map, call variants, and call consensus sequences from amplicon sequence data from 1000 SARS-CoV-2 samples at 1000X depth in < 10 min, with no user intervention. ViReflow’s simplicity, flexibility, and scalability make it an ideal tool for viral molecular epidemiological efforts.

     
  5. Abstract Background

    The Spacecraft Assembly Facility (SAF) at NASA’s Jet Propulsion Laboratory is the primary cleanroom facility used in the construction of some of the planetary protection (PP)-sensitive missions developed by NASA, including the Mars 2020 Perseverance Rover that launched in July 2020. SAF floor samples (n = 98) were collected over a 6-month period in 2016, prior to the construction of the Mars rover subsystems, to better understand the temporal and spatial distribution of bacterial populations (total, viable, cultivable, and spore) in this unique cleanroom.

    Results

    Cleanroom samples were examined for total (living and dead) and viable (living only) microbial populations using molecular approaches and cultured isolates employing the traditional NASA standard spore assay (NSA), which predominantly isolated spores. The 130 NSA isolates were represented by 16 bacterial genera, of which 97% were identified as spore-formers via Sanger sequencing. The most spatially abundant isolate was Bacillus subtilis, and the most temporally abundant spore-former was Virgibacillus pantothenticus. The 16S rRNA gene-targeted amplicon sequencing detected 51 additional genera not found in the NSA method. The amplicon sequencing of the samples treated with propidium monoazide (PMA), which would differentiate between viable and dead organisms, revealed a total of 54 genera: 46 viable non-spore-forming genera and 8 viable spore-forming genera in these samples. The microbial diversity generated by the amplicon sequencing corresponded to ~86% non-spore-formers and ~14% spore-formers. The most common spatially distributed genera were Sphingobium, Geobacillus, and Bacillus, whereas temporally distributed common genera were Acinetobacter, Geobacillus, and Bacillus. Single-cell genomics detected 6 genera in the sample analyzed, with the most prominent being Acinetobacter.

    Conclusion

    This study clearly established that detecting spores via NSA does not provide a complete assessment of the cleanliness of spacecraft-associated environments, since it failed to detect several PP-relevant genera that were only recovered via molecular methods. This highlights the importance of a methodological paradigm shift to appropriately monitor bioburden in cleanrooms, not only for the aeronautical industry but also for the pharmaceutical and medical industries, among others, and the need to employ molecular sequencing to complement traditional culture-based assays.

     
  6.
    The fish gut microbiome is impacted by a number of biological and environmental factors, including fish feed formulations. Unlike in mammals, vertical microbiome transmission is largely absent in fish, and thus little is known about how the gut microbiome is initially colonized during hatchery rearing or about its stability throughout growout stages. Here we investigate how various microbial-rich surfaces from the built environment (BE) and feed influence the development of the mucosal microbiome (gill, skin, and digesta) of an economically important marine fish, yellowtail kingfish, Seriola lalandi, over time. For the first experiment, we sampled gill and skin microbiomes from 36 fish reared in three tank conditions, and demonstrate that the gill is more influenced by the surrounding environment than the skin. In a second experiment, fish mucus (gill, skin, and digesta), the BE (tank side, water, inlet pipe, airstones, and air diffusers) and feed were sampled from indoor-reared fish at three ages (43, 137, and 430 dph; n = 12 per age). At 430 dph, 20 additional fish were sampled from an outdoor ocean net pen. A total of 304 samples were processed for 16S rRNA gene sequencing. Gill and skin alpha diversity increased while gut diversity decreased with age. Diversity was much lower in fish from the ocean net pen compared to indoor fish. The gill and skin are most influenced by the BE early in development, with aeration equipment having more impact in later ages, while the gut “allochthonous” microbiome becomes increasingly differentiated from the environment over time. Feed had a relatively low impact on driving microbial communities. Our findings suggest that S. lalandi mucosal microbiomes are differentially influenced by the BE, with a high turnover and rapid succession occurring in the gill and skin while the gut microbiome is more stable.

    We demonstrate how individual components of a hatchery system, especially aeration equipment, may contribute directly to microbiome development in a marine fish. In addition, results demonstrate how early life (larval) exposure to biofouling in the rearing environment may influence fish microbiome development, which is important for animal health and aquaculture production.
  7. Mackelprang, Rachel (Ed.)
    ABSTRACT Increasing data volumes on high-throughput sequencing instruments such as the NovaSeq 6000 lead to long computational bottlenecks for common metagenomics data preprocessing tasks such as adaptor and primer trimming and host removal. Here, we test whether faster recently developed computational tools (Fastp and Minimap2) can replace widely used choices (Atropos and Bowtie2), obtaining dramatic accelerations with additional sensitivity and minimal loss of specificity for these tasks. Furthermore, the taxonomic tables resulting from downstream processing provide biologically comparable results. However, we demonstrate that for taxonomic assignment, Bowtie2’s specificity is still required. We suggest that periodic reevaluation of pipeline components, together with improvements to standardized APIs to chain them together, will greatly enhance the efficiency of common bioinformatics tasks while also facilitating incorporation of further optimized steps running on GPUs, FPGAs, or other architectures. We also note that a detailed exploration of available algorithms and pipeline components is an important step that should be taken before optimization of less efficient algorithms on advanced or nonstandard hardware.

    IMPORTANCE In shotgun metagenomics studies that seek to relate changes in microbial DNA across samples, processing the data on a computer often takes longer than obtaining the data from the sequencing instrument. Recently developed software packages that perform individual steps in the pipeline of data processing in principle offer speed advantages, but in practice they may contain pitfalls that prevent their use; for example, they may make approximations that introduce unacceptable errors in the data. Here, we show that differences in choices of these components can speed up overall data processing by 5-fold or more on the same hardware while maintaining a high degree of correctness, greatly reducing the time taken to interpret results. This is an important step for using the data in clinical settings, where the time taken to obtain the results may be critical for guiding treatment.